cnn classifier
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks
With the rapid development of deep learning, the increasing complexity and scale of parameters make training a new model increasingly resource-intensive. In this paper, we start from the classic convolutional neural network (CNN) and explore a paradigm that does not require training to obtain new models. Similar to the birth of CNN inspired by receptive fields in the biological visual system, we draw inspiration from the information subsystem pathways in the biological visual system and propose Model Disassembling and Assembling (MDA). During model disassembling, we introduce the concept of relative contribution and propose a component locating technique to extract task-aware components from trained CNN classifiers. For model assembling, we present the alignment padding strategy and parameter scaling strategy to construct a new model tailored for a specific task, utilizing the disassembled task-aware components.The entire process is akin to playing with LEGO bricks, enabling arbitrary assembly of new models, and providing a novel perspective for model creation and reuse. Extensive experiments showcase that task-aware components disassembled from CNN classifiers or new models assembled using these components closely match or even surpass the performance of the baseline,demonstrating its promising results for model reuse. Furthermore, MDA exhibits diverse potential applications, with comprehensive experiments exploring model decision route analysis, model compression, knowledge distillation, and more.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
- (2 more...)
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (3 more...)
- Information Technology > Security & Privacy (0.46)
- Health & Medicine > Therapeutic Area (0.46)
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks
With the rapid development of deep learning, the increasing complexity and scale of parameters make training a new model increasingly resource-intensive. In this paper, we start from the classic convolutional neural network (CNN) and explore a paradigm that does not require training to obtain new models. Similar to the birth of CNN inspired by receptive fields in the biological visual system, we draw inspiration from the information subsystem pathways in the biological visual system and propose Model Disassembling and Assembling (MDA). During model disassembling, we introduce the concept of relative contribution and propose a component locating technique to extract task-aware components from trained CNN classifiers. For model assembling, we present the alignment padding strategy and parameter scaling strategy to construct a new model tailored for a specific task, utilizing the disassembled task-aware components.The entire process is akin to playing with LEGO bricks, enabling arbitrary assembly of new models, and providing a novel perspective for model creation and reuse.
Sequence Analysis Using the Bezier Curve
Murad, Taslim, Ali, Sarwan, Patterson, Murray
The analysis of sequences (e.g., protein, DNA, and SMILES string) is essential for disease diagnosis, biomaterial engineering, genetic engineering, and drug discovery domains. Conventional analytical methods focus on transforming sequences into numerical representations for applying machine learning/deep learning-based sequence characterization. However, their efficacy is constrained by the intrinsic nature of deep learning (DL) models, which tend to exhibit suboptimal performance when applied to tabular data. An alternative group of methodologies endeavors to convert biological sequences into image forms by applying the concept of Chaos Game Representation (CGR). However, a noteworthy drawback of these methods lies in their tendency to map individual elements of the sequence onto a relatively small subset of designated pixels within the generated image. The resulting sparse image representation may not adequately encapsulate the comprehensive sequence information, potentially resulting in suboptimal predictions. In this study, we introduce a novel approach to transform sequences into images using the B\'ezier curve concept for element mapping. Mapping the elements onto a curve enhances the sequence information representation in the respective images, hence yielding better DL-based classification performance. We employed different sequence datasets to validate our system by using different classification tasks, and the results illustrate that our B\'ezier curve method is able to achieve good performance for all the tasks.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (0.66)
- Research Report > Promising Solution (0.48)
Introspective Classification with Convolutional Nets
Long Jin, Justin Lazarow, Zhuowen Tu
We propose introspective convolutional networks (ICN) that emphasize the importance of having convolutional neural networks empowered with generative capabilities. We employ a reclassification-by-synthesis algorithm to perform training using a formulation stemmed from the Bayes theory. Our ICN tries to iteratively: (1) synthesize pseudo-negative samples; and (2) enhance itself by improving the classification. The single CNN classifier learned is at the same time generative -- being able to directly synthesize new samples within its own discriminative model. We conduct experiments on benchmark datasets including MNIST, CIFAR-10, and SVHN using state-of-the-art CNN architectures, and observe improved classification results.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > New York (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Hybrid Quantum Neural Network Advantage for Radar-Based Drone Detection and Classification in Low Signal-to-Noise Ratio
In this paper, we investigate the performance of a Hybrid Quantum Neural Network (HQNN) and a comparable classical Convolution Neural Network (CNN) for detection and classification problem using a radar. Specifically, we take a fairly complex radar time-series model derived from electromagnetic theory, namely the Martin-Mulgrew model, that is used to simulate radar returns of objects with rotating blades, such as drones. We find that when that signal-to-noise ratio (SNR) is high, CNN outperforms the HQNN for detection and classification. However, in the low SNR regime (which is of greatest interest in practice) the performance of HQNN is found to be superior to that of the CNN of a similar architecture.
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Germany (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.70)
A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting
Li, Yuang, Li, Yinglu, Zhang, Min, Su, Chang, Ren, Mengxin, Qiao, Xiaosong, Zhao, Xiaofeng, Piao, Mengyao, Yu, Jiawei, Lv, Xinglin, Ma, Miaomiao, Zhao, Yanqing, Yang, Hao
End-to-end automatic speech recognition (ASR) systems often struggle to recognize rare name entities, such as personal names, organizations, and terminologies not frequently encountered in the training data. This paper presents Contextual Biasing Whisper (CB-Whisper), a novel ASR system based on OpenAI's Whisper model that can recognize user-defined name entities by performing open-vocabulary keyword-spotting (OV-KWS) using the hidden states of Whisper encoder. The recognized entities are used as prompts for the Whisper decoder. We first propose a multitask training approach with OV-KWS and ASR tasks to optimize the model. Experiments show that this approach substantially improves the entity recalls compared to the original Whisper model on Chinese Aishell hot word subsets and two internal code-switch test sets. However, we observed a slight increase in mixed-error-rate (MER) on internal test sets due to catastrophic forgetting. To address this problem and use different sizes of the Whisper model without finetuning, we propose to use OV-KWS as a separate module and construct a spoken form prompt to prevent hallucination. The OV-KWS module consistently improves MER and Entity Recall for whisper-small, medium, and large models.
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
- Asia > China (0.04)
Machine Learning and Computer Vision Techniques in Continuous Beehive Monitoring Applications: A survey
Bilik, Simon, Zemcik, Tomas, Kratochvila, Lukas, Ricanek, Dominik, Richter, Milos, Zambanini, Sebastian, Horak, Karel
Wide use and availability of the machine learning and computer vision techniques allows development of relatively complex monitoring systems in many domains. Besides the traditional industrial domain, new application appears also in biology and agriculture, where we could speak about the detection of infections, parasites and weeds, but also about automated monitoring and early warning systems. This is also connected with the introduction of the easily accessible hardware and development kits such as Arduino, or RaspberryPi family. In this paper, we survey 50 existing papers focusing on the methods of automated beehive monitoring methods using the computer vision techniques, particularly on the pollen and Varroa mite detection together with the bee traffic monitoring. Such systems could also be used for the monitoring of the honeybee colonies and for the inspection of their health state, which could identify potentially dangerous states before the situation is critical, or to better plan periodic bee colony inspections and therefore save significant costs. Later, we also include analysis of the research trends in this application field and we outline the possible direction of the new explorations. Our paper is aimed also at veterinary and apidology professionals and experts, who might not be familiar with machine learning to introduce them to its possibilities, therefore each family of applications is opened by a brief theoretical introduction and motivation related to its base method. We hope that this paper will inspire other scientists to use machine learning techniques for other applications in beehive monitoring.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
- (10 more...)
- Health & Medicine > Consumer Health (1.00)
- Food & Agriculture > Agriculture (0.88)
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals
Kadiri, Sudarsana Reddy, Javanmardi, Farhad, Alku, Paavo
Prior studies in the automatic classification of voice quality have mainly studied the use of the acoustic speech signal as input. Recently, a few studies have been carried out by jointly using both speech and neck surface accelerometer (NSA) signals as inputs, and by extracting MFCCs and glottal source features. This study examines simultaneously-recorded speech and NSA signals in the classification of voice quality (breathy, modal, and pressed) using features derived from three self-supervised pre-trained models (wav2vec2-BASE, wav2vec2-LARGE, and HuBERT) and using a SVM as well as CNNs as classifiers. Furthermore, the effectiveness of the pre-trained models is compared in feature extraction between glottal source waveforms and raw signal waveforms for both speech and NSA inputs. Using two signal processing methods (quasi-closed phase (QCP) glottal inverse filtering and zero frequency filtering (ZFF)), glottal source waveforms are estimated from both speech and NSA signals. The study has three main goals: (1) to study whether features derived from pre-trained models improve classification accuracy compared to conventional features (spectrogram, mel-spectrogram, MFCCs, i-vector, and x-vector), (2) to investigate which of the two modalities (speech vs. NSA) is more effective in the classification task with pre-trained model-based features, and (3) to evaluate whether the deep learning-based CNN classifier can enhance the classification accuracy in comparison to the SVM classifier. The results revealed that the use of the NSA input showed better classification performance compared to the speech signal. Between the features, the pre-trained model-based features showed better classification accuracies, both for speech and NSA inputs compared to the conventional features. It was also found that the HuBERT features performed better than the wav2vec2-BASE and wav2vec2-LARGE features.
- Europe > Finland (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > India (0.04)
- Health & Medicine (0.93)
- Government > Regional Government (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.75)